Streaming SIMD Extensions
SSE stands for Streaming SIMD Extensions, and SIMD stands for Single Instruction, Multiple Data. It is a way for the CPU to speed up operations that apply the same instruction to many pieces of data. Normally the CPU would have to execute that instruction separately for each piece of data. With SSE, a single instruction operates on several values at once (four single-precision floating point numbers in the original SSE), which saves CPU time for other operations. The original SSE handled single-precision floating point calculations; SSE2 added support for double-precision floating point and integer calculations. SSE has been succeeded by SSE2 and SSE3.
SSE is a SIMD instruction set designed by Intel and introduced in their Pentium III series processors as a reply to AMD's 3DNow!, which had debuted a year or so earlier. It was originally known as KNI, for Katmai New Instructions (Katmai was the code name for the first Pentium III core). During the Katmai project, Intel was looking to distinguish it from their earlier product line, particularly their flagship Pentium II. AMD eventually added support for SSE instructions in its Athlon XP processor.
Intel was generally disappointed with their first IA-32 SIMD effort, MMX. MMX had two main problems: it reused the existing floating point registers, making the CPU unable to work on both floating point and SIMD data at the same time, and it worked only on integers.
SSE added eight new 128-bit registers known as XMM0 through XMM7. Each register packs together four 32-bit single-precision floating point numbers. Because these 128-bit registers are additional program state that the operating system must preserve across task switches, they are disabled by default until the operating system explicitly enables them. This means that the OS must know how to use FXSAVE and FXRSTOR, an extended pair of instructions that save and restore the x87, MMX, 3DNow!, and SSE register state all at once. This support was quickly added to all major IA-32 operating systems.
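The idea of packing four single-precision numbers into one XMM register can be sketched in C with compiler intrinsics. This is only an illustration, not anything taken from Intel's documentation: the function name add4 is hypothetical, and the sketch assumes an x86 compiler that ships the xmmintrin.h header with SSE enabled (for example GCC or Clang with -msse on 32-bit targets).

```c
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_loadu_ps, _mm_add_ps, _mm_storeu_ps */

/* Hypothetical helper for illustration: adds four pairs of floats
   using a single packed SSE addition instead of four scalar ones. */
void add4(const float *a, const float *b, float *out)
{
    __m128 va  = _mm_loadu_ps(a);     /* load 4 floats (no alignment required) */
    __m128 vb  = _mm_loadu_ps(b);     /* load 4 more floats */
    __m128 sum = _mm_add_ps(va, vb);  /* one packed add: 4 additions at once */
    _mm_storeu_ps(out, sum);          /* write the 4 results back to memory */
}
```

Calling add4 on the arrays {1, 2, 3, 4} and {10, 20, 30, 40} fills out with {11, 22, 33, 44}. The unaligned load and store variants are used so that the caller does not have to align the arrays to 16 bytes.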
Because SSE adds floating point support, it sees much more use than MMX, now that graphics cards handle integer calculations internally. Integer SIMD operations may still be performed with the eight 64-bit MMX registers, which are "aliased" on top of the eight FPU registers. Note: starting with SSE2, integers can also be handled through the SSE XMM registers, so the MMX instruction set is now largely redundant.
On the Pentium III, however, SSE is implemented using the same circuitry as the FPU, meaning that, once again, the CPU cannot issue both FPU and SSE instructions at the same time for pipelining. The separate registers do allow SIMD and scalar floating point operations to be mixed without the performance hit from explicit MMX/floating point mode switching.
Intel's Pentium 4 implements SSE2, an extension to the basic SSE instruction set. The major features of SSE2 are support for double-precision (64-bit) floating point numbers and support for integer data types in the 128-bit vector registers introduced with SSE, allowing the programmer to avoid the MMX/FPU registers. SSE2 has itself been extended by SSE3.
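The two SSE2 additions mentioned above, integers and double-precision numbers in the XMM registers, can be sketched the same way. Again this is a minimal illustration under stated assumptions: the names add4_i32 and add2_f64 are hypothetical, and the compiler is assumed to provide the emmintrin.h header with SSE2 enabled (the default on x86-64).

```c
#include <emmintrin.h>  /* SSE2 intrinsics: __m128i, __m128d and their operations */
#include <stdint.h>     /* int32_t */

/* Hypothetical helper: adds four pairs of 32-bit integers in one XMM
   register, with no use of the MMX/FPU registers. */
void add4_i32(const int32_t *a, const int32_t *b, int32_t *out)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);  /* load 4 x int32 */
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_add_epi32(va, vb));
}

/* Hypothetical helper: adds two pairs of 64-bit doubles; a 128-bit
   register holds two double-precision values instead of four floats. */
void add2_f64(const double *a, const double *b, double *out)
{
    __m128d va = _mm_loadu_pd(a);  /* load 2 x double */
    __m128d vb = _mm_loadu_pd(b);
    _mm_storeu_pd(out, _mm_add_pd(va, vb));
}
```

Because the register width stays at 128 bits, the number of values processed per instruction depends on the element size: four 32-bit integers but only two 64-bit doubles.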